Global Convergence of Non-Convex Gradient Descent for Computing Matrix Squareroot

Authors

  • Prateek Jain
  • Chi Jin
  • Sham M. Kakade
  • Praneeth Netrapalli
Abstract

While there has been a significant amount of work studying gradient descent techniques for non-convex optimization problems over the last few years, all existing results establish either local convergence with good rates or global convergence with highly suboptimal rates, for many problems of interest. In this paper, we take the first step in getting the best of both worlds: establishing global convergence and obtaining a good rate of convergence for the problem of computing the squareroot of a positive definite (PD) matrix, a widely studied problem in numerical linear algebra with applications in machine learning and statistics, among others. Given a PD matrix M and a PD starting point U₀, we show that gradient descent with an appropriately chosen step size finds an ε-accurate squareroot of M in O(α log(‖M − U₀²‖_F / ε)) iterations, where α := max{‖U₀‖₂², ‖M‖₂} / min{σ_min²(U₀), σ_min(M)}. Our result is the first to establish global convergence for this problem and to show that it is robust to errors in each iteration. A key contribution of our work is the general proof technique, which we believe should further excite research in understanding deterministic and stochastic variants of simple non-convex gradient descent algorithms with good global convergence rates for other problems in machine learning and numerical linear algebra.

Appearing in Proceedings of the 20th International Conference on Artificial Intelligence and Statistics (AISTATS) 2017, Fort Lauderdale, Florida, USA. JMLR: W&CP volume 54. Copyright 2017 by the authors.
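To make the iteration concrete, here is a minimal NumPy sketch of gradient descent on the objective f(U) = ‖U² − M‖_F² that the abstract describes. The step-size scale (≈ 1/‖M‖₂), the stopping tolerance, and the identity initialization are illustrative assumptions, not the exact constants from the paper's analysis.

```python
import numpy as np

def matrix_sqrt_gd(M, U0, eta=None, tol=1e-8, max_iters=10000):
    """Gradient descent on f(U) = ||U^2 - M||_F^2 for a PD matrix M.

    A sketch of the algorithm the abstract describes; the default step
    size eta ~ 1/||M||_2 is an assumed scale, not the paper's constant.
    """
    if eta is None:
        eta = 0.25 / np.linalg.norm(M, 2)   # assumed step-size scale
    U = U0.copy()
    for _ in range(max_iters):
        R = U @ U - M                       # residual U^2 - M
        if np.linalg.norm(R, "fro") < tol:
            break
        # For symmetric U, grad f(U) = 2(RU + UR); the constant 2 is
        # absorbed into the step size eta.
        U = U - eta * (R @ U + U @ R)
    return U

# Example: squareroot of a random PD matrix, starting from U0 = I.
rng = np.random.default_rng(0)
A = rng.standard_normal((5, 5))
M = A @ A.T + 5 * np.eye(5)                 # positive definite
U = matrix_sqrt_gd(M, U0=np.eye(5))
print(np.linalg.norm(U @ U - M, "fro"))     # small residual
```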


Related papers

On the Matrix Square Root via Geometric Optimization

This paper is triggered by the preprint [P. Jain, C. Jin, S.M. Kakade, and P. Netrapalli. Computing matrix squareroot via non convex local search. Preprint, arXiv:1507.05854, 2015.], which analyzes gradient-descent for computing the square root of a positive definite matrix. Contrary to claims of Jain et al., the author’s experiments reveal that Newton-like methods compute matrix square roots r...
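For contrast with the gradient-descent approach, one standard Newton-like scheme of the kind this snippet alludes to is the Denman-Beavers iteration; the sketch below is illustrative, with an assumed fixed iteration count rather than a proper convergence test.

```python
import numpy as np

def denman_beavers_sqrt(M, iters=20):
    """Denman-Beavers iteration, a classic Newton-like method for the
    squareroot of a PD matrix M. The fixed iteration count is an
    illustrative assumption; production code would test convergence."""
    Y = M.copy()
    Z = np.eye(M.shape[0])
    for _ in range(iters):
        Y_next = 0.5 * (Y + np.linalg.inv(Z))
        Z_next = 0.5 * (Z + np.linalg.inv(Y))
        Y, Z = Y_next, Z_next
    return Y    # Y converges to M^{1/2} (and Z to M^{-1/2})
```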


Computing Matrix Squareroot via Non Convex Local Search

We consider the problem of computing the squareroot of a positive semidefinite (PSD) matrix. Several fast algorithms (some based on eigenvalue decomposition and some based on Taylor expansion) are known to solve this problem. In this paper, we propose another way to solve this problem: a natural algorithm performing gradient descent on a non-convex formulation of the matrix squareroot problem. ...


Global Optimality of Local Search for Low Rank Matrix Recovery

We show that there are no spurious local minima in the non-convex factorized parametrization of low-rank matrix recovery from incoherent linear measurements. With noisy measurements we show all local minima are very close to a global optimum. Together with a curvature bound at saddle points, this yields a polynomial time global convergence guarantee for stochastic gradient descent ...


Global Convergence of Stochastic Gradient Descent for Some Non-convex Matrix Problems

The Burer-Monteiro [1] decomposition (X = YYᵀ) with stochastic gradient descent is commonly employed to speed up and scale up matrix problems including matrix completion, subspace tracking, and SDP relaxation. Although it is widely used in practice, there exist no known global convergence results for this method. In this paper, we prove that, under broad sampling conditions, a first-order ra...
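As a rough illustration of that setup, the sketch below runs SGD over the observed entries of a symmetric matrix using the factorization X = YYᵀ; the rank, step size, and sampling scheme are assumptions made for this example, not the conditions of the paper.

```python
import numpy as np

def bm_sgd_completion(M_obs, mask, r, eta=0.02, epochs=200, seed=0):
    """SGD on the Burer-Monteiro factorization X = Y Y^T for symmetric
    matrix completion: minimize the sum over observed (i, j) of
    (Y[i] @ Y[j] - M[i, j])^2. Hyperparameters are illustrative."""
    rng = np.random.default_rng(seed)
    Y = 0.1 * rng.standard_normal((M_obs.shape[0], r))
    obs = np.argwhere(mask)
    for _ in range(epochs):
        rng.shuffle(obs)                          # one pass per epoch
        for i, j in obs:
            resid = Y[i] @ Y[j] - M_obs[i, j]
            gi, gj = resid * Y[j], resid * Y[i]   # partial gradients
            Y[i] -= eta * gi
            Y[j] -= eta * gj
    return Y

# Example: complete a rank-3 symmetric matrix from ~60% of its entries.
rng = np.random.default_rng(1)
Z = rng.standard_normal((30, 3))
M = Z @ Z.T
upper = np.triu(rng.random(M.shape) < 0.6)
mask = upper | upper.T                            # symmetric mask
Y = bm_sgd_completion(M * mask, mask, r=3)
print(np.linalg.norm(Y @ Y.T - M) / np.linalg.norm(M))  # relative error
```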


A Non-Euclidean Gradient Descent Framework for Non-Convex Matrix Factorization

We study convex optimization problems that feature low-rank matrix solutions. In such scenarios, non-convex methods offer significant advantages over convex methods due to their lower space complexity as well as faster convergence speed. Moreover, many of these methods feature rigorous approximation guarantees. Non-convex algorithms are simple to analyze and implement as they perform Euclidean ...



Venue: Proceedings of the 20th International Conference on Artificial Intelligence and Statistics (AISTATS), JMLR: W&CP volume 54

Publication year: 2017